Because everything is represented in bits, globally agreed-upon standards are needed for representing letters and characters.
Ex: in ASCII, 65 in decimal is the letter A; in binary, 01000001 = A
Question: How does the computer know when we mean 65 or A?
We determine the context with things like prefixes and file formats.
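A minimal sketch of this idea in Python (using the built-in chr and ord): the same bit pattern can be read as the number 65 or as the character A, depending on how the program interprets it.

```python
# The same 8-bit pattern, interpreted two ways.
value = 0b01000001      # the bits 01000001

print(value)            # interpreted as an integer: 65
print(chr(value))       # interpreted as a character: 'A'
print(ord("A"))         # and back again: 'A' -> 65
```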
Problem: ASCII is quite US-centric
Solution: more global standards (see: Unicode)
Measuring Bits
Suppose we send the message: 72 73 33 ("HI!")
Suppose each character is represented with 8 bits; that’s 24 bits to send one message
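A quick Python sketch confirming the arithmetic: the byte values 72, 73, 33 decode to "HI!", and at 8 bits per character the message is 24 bits.

```python
# Three bytes -> "HI!", 8 bits each -> 24 bits total.
message = bytes([72, 73, 33])

print(message.decode("ascii"))   # HI!
print(len(message) * 8)          # 24
```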
However, bits are pretty small (both physically and in how much information each one carries), so we don’t usually measure things in bits
Byte: 8 bits
The biggest number we can store in a byte is 255 (11111111)
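A small sketch of why 255 is the ceiling: 8 bits give 2**8 = 256 possible patterns, and the largest of them is all ones.

```python
# 8 bits -> 256 possible patterns, the largest value being 255.
print(2 ** 8)        # 256
print(0b11111111)    # 255
```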
Unicode
256 unique values for a byte work for English ASCII, but other global standards are needed to support other languages and things like emoji. One solution is Unicode.
Unicode: a superset of ASCII that supports a much wider variety of characters. It keeps 8-bit ASCII for backwards compatibility, and uses 16 bits for more than 65,000 characters and 32 bits for more than 4 billion characters.
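As a rough illustration (this sketch uses Python's ord and the UTF-8 encoding, which is variable-width rather than a fixed 8/16/32 bits): characters beyond ASCII have larger code points and need more bytes.

```python
# Code points grow beyond what 8 bits can hold, so wider encodings are needed.
for ch in ["A", "é", "你", "😀"]:
    print(ch, hex(ord(ch)), len(ch.encode("utf-8")), "bytes in UTF-8")
```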
Unicode standardizes the description of characters. Manufacturers, companies, font creators, and users determine how those characters are displayed.
This can lead to miscommunication (e.g., the same emoji rendered as a real gun on one platform and a water gun on another)